Recurrent World Models Facilitate Policy Evolution
A generative recurrent neural network is quickly trained in an unsupervised manner to model popular reinforcement learning environments through compressed spatio-temporal representations. The world model's extracted features are fed into compact and simple policies trained by evolution, achieving state-of-the-art results in various environments. We also train our agent entirely inside of an environment generated by its own internal world model, and transfer this policy back into the actual environment. An interactive version of this paper is available at https://worldmodels.github.io
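The "compact and simple policies trained by evolution" in the abstract can be sketched as a single linear controller over the world model's features, with its flattened weights searched directly by an evolution strategy. This is an illustrative sketch under assumed dimensions (`z_dim`, `h_dim`, `a_dim` are placeholders), not the paper's exact code:

```python
import numpy as np

# Hypothetical sizes: VAE latent, RNN hidden state, and action dimensions.
z_dim, h_dim, a_dim = 32, 256, 3

rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(a_dim, z_dim + h_dim))
b = np.zeros(a_dim)

def controller(z, h):
    """Compact policy: tanh-squashed linear map from [z, h] to an action."""
    x = np.concatenate([z, h])
    return np.tanh(W @ x + b)

# An evolution strategy treats (W, b) as one flat parameter vector and
# searches over it directly, using episode return as the fitness signal.
params = np.concatenate([W.ravel(), b])
```

Because the controller is this small (here 867 parameters), black-box evolutionary search over `params` remains tractable even though the world model itself has millions of weights.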
Reviews: Recurrent World Models Facilitate Policy Evolution
Summary: This paper proposes a new way to develop a world model for reinforcement learning. The focus is on the encoding of the visual world, coupled with a world model that learns from the compressed representation. The world model is a recurrent version of Bishop's (1995, neural networks book, chapter 6) mixture-of-Gaussians network. That network outputs the weights of an MOG (via softmax), the means of the Gaussians (linear outputs), and the variances (parameterized as e^var, so each is a positive scale parameter). I had not seen a recurrent version of this network before.
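The mixture-of-Gaussians output layer the review describes can be sketched as follows: one raw output vector from the network is split into mixture weights (softmax), means (linear), and scales (exponentiated, so always positive). This is a minimal 1-D illustration, assuming a component count `K` chosen for the example rather than taken from the paper:

```python
import numpy as np

K = 5  # number of mixture components (illustrative, not the paper's setting)

def mdn_head(raw):
    """Split raw network outputs into (weights, means, sigmas) for a 1-D MDN."""
    logits, mu, log_sigma = raw[:K], raw[K:2*K], raw[2*K:3*K]
    pi = np.exp(logits - logits.max())
    pi /= pi.sum()              # softmax -> mixture weights sum to 1
    sigma = np.exp(log_sigma)   # e^var parameterization: always a positive scale
    return pi, mu, sigma

raw = np.zeros(3 * K)           # e.g. the recurrent network's output at one step
pi, mu, sigma = mdn_head(raw)
```

Exponentiating the variance output is what makes it a scale parameter: the network can emit any real number, and the density still receives a strictly positive sigma.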
Ha, David; Schmidhuber, Jürgen
Published at the Neural Information Processing Systems Conference.